ABSTRACT


Credit cards play a vital role in today's economy, and their usage has
increased dramatically. They have become one of the most common methods of
payment for both online and offline purchases, including everyday purchases.
Credit card organizations need to identify fraudulent transactions so that
their clients are not charged for purchases they did not make. Although
credit cards offer substantial benefits when used carefully and responsibly,
fraudulent activity can cause significant credit and financial damage.
Numerous methods have been proposed to stop such activity. This project
builds a model on a transaction dataset to predict fraudulent transactions
using machine learning. The model classifies each transaction as fraudulent
or genuine. The project also analyses and pre-processes the dataset and
applies multiple anomaly detection algorithms, namely Local Outlier Factor
and Isolation Forest.
I. INTRODUCTION

Fraud can be defined as criminal deception aimed at
monetary or personal gain [1], or intended to harm another
person, without necessarily resulting in direct legal
consequences. Credit card fraud can be defined as a case
where an individual uses someone else’s credit card for
their own purposes while the owner and the card-issuing
authority are unaware that the card is being used by
another person.


Fraud detection systems come into play when fraudsters
get past fraud prevention systems and initiate a
fraudulent transaction. With the rise of the e-commerce
sector, the use of credit cards for online shopping as
well as regular purchases has grown rapidly, which has
increased the rate of credit card fraud.


Fraud detection is conducted by monitoring the behaviour of
different users and detecting any undesirable deviations
from it. Monetary losses due to credit card fraud have been
growing, and several papers report large losses in
different countries [2-4].


Many anomaly detection techniques, both supervised and
unsupervised, are applied to find fraudulent transactions.
Supervised techniques such as SVM, decision trees, KNN and
logistic regression offer better results and can solve the
fraud detection problem to an extent [5]. However, these
methods need labelled data, covering both fraudulent and
non-fraudulent behaviour, to build the classifier. In
unsupervised techniques the data does not have to be
labelled; they rely on the assumption that fraudulent
behaviour differs markedly from normal behaviour. Decision
trees, logistic regression, neural networks, Support Vector
Machines and nearest neighbour algorithms are some of the
most commonly used methods. The objective of this model is
to detect whether recent transactions are fraudulent using
algorithms such as Isolation Forest, Local Outlier Factor
and Support Vector Machine, aiming to reduce the number of
false positives and detect as many frauds as possible, and
to compare the accuracy that each algorithm provides.
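
The intuition behind the Isolation Forest can be illustrated with a
minimal sketch. It is not the implementation used in this work; it is a
simplified pure-Python version written for illustration, and the toy
(amount, hour) data, the function names and the parameter defaults are
all assumptions. Points that differ strongly from the rest are separated
by fewer random splits, so they receive a shorter average path length
and a higher anomaly score.

```python
import math
import random


def build_tree(points, depth, max_depth, rng):
    """Recursively split points on a random feature at a random threshold."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    feature = rng.randrange(len(points[0]))
    values = [p[feature] for p in points]
    lo, hi = min(values), max(values)
    if lo == hi:  # a constant feature cannot be split
        return {"size": len(points)}
    threshold = rng.uniform(lo, hi)
    left = [p for p in points if p[feature] < threshold]
    right = [p for p in points if p[feature] >= threshold]
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def path_length(point, node, depth=0):
    """Number of splits needed to reach the leaf that contains point."""
    if "feature" not in node:
        return depth + c_factor(node["size"])
    branch = "left" if point[node["feature"]] < node["threshold"] else "right"
    return path_length(point, node[branch], depth + 1)


def anomaly_scores(points, n_trees=100, sample_size=64, seed=0):
    """Score each point in (0, 1); higher means easier to isolate."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    norm = c_factor(sample_size)
    return [
        2.0 ** (-(sum(path_length(p, t) for t in trees) / n_trees) / norm)
        for p in points
    ]


# Toy data (assumed): 60 ordinary (amount, hour) pairs plus one unusual one.
normal = [(20.0 + (i % 10), 10.0 + (i // 10)) for i in range(60)]
points = normal + [(500.0, 3.0)]
scores = anomaly_scores(points)
most_anomalous = scores.index(max(scores))
print("most anomalous point:", points[most_anomalous])
```

In practice a library implementation would normally be preferred over a
hand-written one; the sketch only shows why unusual transactions tend to
be isolated early.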


II. PROBLEM DEFINITION

Credit card frauds are of three types: traditional card-related
frauds, merchant-related frauds and Internet frauds [6].


Many models have been researched, and several techniques
have been found to prevent and detect credit card fraud.


Some credit card fraud transaction datasets are
imbalanced. An accurate fraud detection system must
identify fraudulent transactions correctly and make
detection viable for real-time transactions. Fraud
detection is of two types: anomaly detection and misuse
detection.


In anomaly detection, the model is trained on normal
transactions, and fraudulent data is identified as a
deviation from them using various techniques. In contrast,
misuse detection uses transactions labelled as normal or
fraudulent.


Thus, misuse detection falls under supervised learning and
anomaly detection under unsupervised learning [7]. The
resulting model can then be used to detect whether a
transaction is fraudulent.
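
As an illustration of unsupervised anomaly detection, the Local Outlier
Factor compares the density around a point with the density around its
nearest neighbours. The sketch below is a simplified pure-Python version
under stated assumptions: the 2-D toy points, the helper names and the
choice k=3 are hypothetical, ties at the k-th distance are ignored, and
duplicate points are assumed absent. It is not the code used in this
work.

```python
import math


def k_nearest(points, i, k):
    """Return (distance, index) pairs for the k nearest neighbours of point i."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]


def local_outlier_factor(points, k=3):
    """Return one LOF value per point; values well above 1 suggest outliers."""
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance.
    # Assumes no duplicate points, so the sum below is never zero.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))
    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Toy data (assumed): a dense 5x5 grid plus one isolated point.
grid = [(float(x), float(y)) for x in range(5) for y in range(5)]
lof_points = grid + [(20.0, 20.0)]
lof = local_outlier_factor(lof_points, k=3)
print("highest LOF at:", lof_points[lof.index(max(lof))])
```

Points inside the dense grid obtain values close to 1, while the
isolated point obtains a much larger value, which is the property an
anomaly detector exploits.
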
III. WORKING METHODOLOGY

The dataset is imported and pre-processed. Exploratory data
analysis is performed on the classes with respect to their
frequency. The result shows the number of normal and
fraudulent transactions present in the dataset. The data is
then analysed in terms of amount and time. The resulting
figures show that the amounts are small for fraudulent
transactions but that fraudulent transactions are numerous
when viewed over time. A sample of the data is taken to
determine the number of fraudulent and normal transactions.
Dependent and independent variables are created to apply
the model: every column whose name is not ‘class’ is treated
as an independent feature, and the ‘class’ column is the
dependent variable. Finally, prediction is performed using
the Isolation Forest and Local Outlier Factor algorithms.
The results show that these two algorithms outperform SVM
at separating outliers.
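
The pre-processing steps described here can be sketched as follows. This
is a minimal illustration, not the project's code: the rows are
hypothetical stand-ins for the real dataset, the column names "Time" and
"Amount" are assumptions, and only the label column ‘class’ (1 for
fraud, 0 for normal) follows the text.

```python
from collections import Counter

# Hypothetical rows standing in for the real dataset.
rows = [
    {"Time": 0, "Amount": 12.5, "class": 0},
    {"Time": 10, "Amount": 80.0, "class": 0},
    {"Time": 25, "Amount": 43.2, "class": 0},
    {"Time": 40, "Amount": 1.0, "class": 1},
    {"Time": 55, "Amount": 230.0, "class": 0},
    {"Time": 70, "Amount": 19.9, "class": 0},
    {"Time": 85, "Amount": 3.5, "class": 1},
    {"Time": 90, "Amount": 64.0, "class": 0},
]

LABEL = "class"

# Class frequency: how many normal (0) and fraudulent (1) transactions.
class_counts = Counter(row[LABEL] for row in rows)
fraud_fraction = class_counts[1] / len(rows)

# Every column that is not the label is an independent feature.
feature_names = [name for name in rows[0] if name != LABEL]
X = [[row[name] for name in feature_names] for row in rows]
y = [row[LABEL] for row in rows]

print("class counts:", dict(class_counts))
print("features:", feature_names)
```

The fraction of fraudulent rows computed this way is the kind of value
that anomaly detectors commonly accept as an expected outlier
proportion; whether it was used that way in this work is not stated.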


IV. WORKING FLOW

Import the dataset and perform the data pre-processing
steps. Perform data analysis to find the frequency of each
class. The analysis gives the number of normal and
fraudulent transactions; normal transactions far outnumber
fraudulent ones. The value 1 denotes a fraudulent
transaction and 0 denotes a normal transaction. The dataset
is then examined for fraudulent transactions in terms of
amount and time. A sample of the data is taken to determine
how many cases are valid and how many are fraudulent. The
independent and dependent features are then created to
apply the model: X holds the independent features and Y
holds the dependent feature (the class). Model prediction
is done using the Isolation Forest and Local Outlier Factor
algorithms. The accuracy score and classification report
are printed.


V. RESULT AND DISCUSSION


Fig 1 shows the number of transactions in each class. There are more than 250,000 normal transactions,
whereas fraudulent transactions are very few in the dataset.


Fig 2 shows the fraudulent data with respect to amount; the amounts are small for fraudulent transactions.
Fig 3 shows transactions with respect to time; there is a large number of transactions over time.


Fig 4 shows the output for the fraudulent data in the dataset. The Isolation Forest and Local Outlier Factor algorithms
outperform the Support Vector Machine algorithm.
[Fig 2: Amount per transaction by class]
[Fig 3: Time of transaction vs Amount by class]
[Fig 4: Output]
The Isolation Forest algorithm made 73 errors, with an accuracy of 99.74%. Local Outlier Factor made 97 errors, with an
accuracy of 99.65%, whereas the Support Vector Machine made 8516 errors, with an accuracy of 70.09%. Comparing precision
and recall across the three models, the Isolation Forest performed much better than the Local Outlier Factor: its fraud
detection rate is around 27%, against 2% for Local Outlier Factor and 0% for the Support Vector Machine. Overall, the
Isolation Forest method performed much better at identifying fraud cases, detecting around 30% of them.
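
Because the classes are heavily imbalanced, accuracy and fraud detection
rate (recall) can diverge sharply, which is why both are worth
reporting. The sketch below uses made-up labels, not the results of this
work, and the helper `evaluate` is a hypothetical name. It shows two
predictors with identical accuracy, one of which detects no fraud at
all.

```python
def evaluate(y_true, y_pred):
    """Return error count, accuracy, precision and recall for class 1 (fraud)."""
    pairs = list(zip(y_true, y_pred))
    errors = sum(t != p for t, p in pairs)
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return {
        "errors": errors,
        "accuracy": (len(pairs) - errors) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels: 996 normal transactions (0) and 4 fraudulent ones (1).
y_true = [0] * 996 + [1] * 4

# Predictor A labels everything as normal.
all_normal = [0] * 1000

# Predictor B raises one false alarm and catches one of the four frauds.
detector = [0] * 995 + [1] + [1] + [0] * 3

report_a = evaluate(y_true, all_normal)
report_b = evaluate(y_true, detector)
print("all normal:", report_a)
print("detector:  ", report_b)
```

Both predictors make four errors out of 1000, so their accuracy is the
same, yet only the second has a non-zero detection rate. This is how a
model can reach an accuracy above 99% while detecting only a fraction of
the fraud cases.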


VI. CONCLUSION 

In this paper, an analysis of credit card fraud
identification was described on a publicly available dataset
using machine learning techniques such as the Isolation
Forest algorithm and Local Outlier Factor. The results show
that the Isolation Forest is very efficient and outperforms
the other methods in detecting anomalies in credit card
transactions. Using this algorithm in a credit card fraud
detection system can allow fraud to be detected or
predicted within a very short time after a transaction has
been made. This will ultimately protect banks and customers
from large losses and reduce risk.